Abstract - Reconfigurable computing refers to the use of
processors, such as Field Programmable Gate Arrays (FPGAs),
that can be modified at the hardware level to take on
different processing tasks. A reconfigurable computing
platform describes the hardware and software base on top of
which modular extensions can be created, depending on the
desired application. Such platforms can take on varied
designs and implementations, according to the constraints
imposed and the features desired by the scope of applications.
This paper introduces a PC-based reconfigurable computing
platform software framework that is flexible and extensible
enough to abstract the different hardware types and
functionality that different PCs may have. The requirements
of the software platform, the architectural issues addressed,
the rationale behind the decisions made, and the framework
design implemented are discussed.

Keywords - reconfigurable computing, software platform

I. Introduction

Computer processors have for many years been designed
based on the von Neumann or Harvard architectures. Software
to be run on these processors is compiled into a set of
processor-specific instructions, which are loaded at run time
and executed sequentially. Such sequential processing of an
instruction every few clock cycles works well enough for
typical PC applications such as text editors, which have low
data processing requirements.

However, PCs are also often used for computationally
intensive, high-throughput data processing, especially in
scientific research work. The sequential nature of the typical
PC processor, such as the Intel Pentium, becomes a major
processing bottleneck in such situations. The solution to this
problem has been to use processors with greater clock speeds,
or to network several of these PCs together into a cluster or
computational grid [1].

More recently, there has been increasing interest in the use
of reconfigurable hardware chips for such computationally and
data-intensive processing. These chips, such as Field
Programmable Gate Arrays (FPGAs), possess a fundamentally
different architecture from the typical von Neumann or
Harvard type processors. The algorithms to be executed are
normally defined in a hardware description language and
compiled into a bitstream, which is downloaded to the FPGA
whenever the algorithm is needed. This bitstream download
reconfigures the hardware logic on the FPGA accordingly,
allowing data passed into the FPGA to be processed in
hardware, in parallel.

Several reconfigurable computing research projects [2]
focus on developing new, improved designs of reconfigurable
chips. Other groups [5] [6] [7] utilize off-the-shelf FPGAs,
such as those from Xilinx [8], and work on issues such as
logic placement and routing optimization [3]. Project Proteus
was initiated by the DSP Technology Centre of Ngee Ann
Polytechnic (Singapore) to develop a low-cost FPGA-based
reconfigurable computing platform for typical PCs, with
off-the-shelf hardware components and a portable software
platform layer that is flexible and extensible enough to
abstract the different hardware types and functionality that
different PCs may have. This paper discusses the requirements
and design of this software platform.

Section II describes the requirements of the Proteus Software
Platform, Section III discusses the architectural issues
addressed and the design of the software platform, Section IV
explains how the software platform deploys algorithms to
available hardware, and finally Section V concludes this
paper.

II. Requirements of the Proteus Software 
Platform 

To understand the architectural design of the software 
platform, it will be useful to first discuss the requirements 
imposed by the desired use and level of flexibility of the 
platform. 

Firstly, the goal of the project has been to develop a 
PC-based reconfigurable computing platform. PCs run a 
variety of operating systems (OS), such as Microsoft Win- 
dows and Linux. It is therefore desirable for the software 
platform to be portable across various OS environments. 

Secondly, being PC-based also brings the advantage of being
able to utilize the various PC resources, such as plentiful
RAM, hard disk storage space, and network connectivity. The
software platform must be able to abstract access to these
resources as sources and sinks of data. On top of that, it
must also be possible to use several FPGA chips concurrently
(which may reside on several different PCI boards).

Thirdly, the high level of variability of available numbers 
and types of PC resources as well as reconfigurable proces- 
sors means that the software platform has to be highly 
modular, with hardware abstraction modules that can be 
dynamically loaded according to the available resources. 

Fourthly, this wide resource variation also has an
implication for the deployability of algorithms: certain
algorithm implementations may be suitable for execution only
on certain processor types. For example, a reconfigurable
hardware bitstream compiled for a Xilinx Virtex FPGA cannot
be downloaded to an Altera Stratix FPGA, though both chips
may exist in the same PC. The software platform will
therefore have to match the available hardware types with the
available compatible algorithm implementations.

Finally, all this need for flexibility in the software plat- 
form of being able to load different hardware abstraction 
and algorithm implementation modules means that such 
modules should be easily created in a high-level language 
that most programmers are familiar and comfortable with. 

III. Architecture of the Software Platform

Considering the requirements set out in Section II, a
high-level and modular software platform framework was
designed.

The requirements for portability across OS environ- 
ments, modularity of extensions, and ease of programma- 
bility, led to the Java language being selected for imple- 
mentation of the software platform. This allows the soft- 
ware platform to be run on any computer that has a Java 
Virtual Machine (JVM) installed, while the high-level and 
object-oriented nature of the language satisfies the require- 
ments of dynamically loadable modules that can be easily 
programmed in a widely-adopted language. 

To modularize its functionality, the software platform 
has been divided into four main component blocks: the 
Proteus Software Platform (PSP) core, which holds the 
common set of interfaces and functionality, and three other 
components: the Proteus Application, Hardware Abstrac- 
tion Modules (HAMs), and Software Modules, that are de- 
ployed according to the available functionality on the PC 
and the desired application. The use of Java allows each of 
these component modules to be distributed as individual 
JAR files. This segmentation is illustrated in Figure 1 and
described in greater detail below.

Fig. 2

A typical Algorithm block representation



Each of these blocks usually has a processor-specific
implementation, such as a compiled Java class or an FPGA
hardware implementation bitstream.

However, the Proteus Software Platform is intended to 
be run in environments where the available processor types 
are variable and determined only during run-time, and 
where Algorithms may have a number of implementations 
for different processor types. 

Hence there is a need for a different Algorithm structure, 
one which allows for a high level description of the connec- 
tivity between Algorithm blocks, while allowing each block 
to have multiple implementations for the various processor 
types. 

The resulting design takes on a 'shell/implementation'
architecture, as shown in Figure 3. In this structure, the
Algorithm 'shells' are connected up to one another, and
define the input/output data types. A 'shell' can be
associated with multiple 'implementations', each of which is
compatible with a different processor type (such as an FPGA
or JVM).
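
The 'shell/implementation' structure described above could be modelled roughly as follows. This is a minimal sketch, not the Proteus API: the class names `Shell` and `Implementation`, the string processor types "jvm" and "fpga", and the file names are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Illustrative sketch only: these class names are not the Proteus API.
final class ShellSketch {

    /** One processor-specific realization, e.g. a Java class or an FPGA bitstream. */
    static final class Implementation {
        final String processorType; // assumed labels such as "jvm" or "fpga"
        final String artifact;      // hypothetical class file or bitstream file name

        Implementation(String processorType, String artifact) {
            this.processorType = processorType;
            this.artifact = artifact;
        }
    }

    /** The processor-independent 'shell': fixes the I/O data types and holds implementations. */
    static final class Shell {
        final String name;
        final Class<?> inputType;
        final Class<?> outputType;
        private final List<Implementation> implementations = new ArrayList<>();

        Shell(String name, Class<?> inputType, Class<?> outputType) {
            this.name = name;
            this.inputType = inputType;
            this.outputType = outputType;
        }

        /** Associates one more implementation with this shell. */
        Shell add(Implementation implementation) {
            implementations.add(implementation);
            return this;
        }

        /** Returns the first implementation registered for the given processor type, if any. */
        Optional<Implementation> findFor(String processorType) {
            return implementations.stream()
                    .filter(i -> i.processorType.equals(processorType))
                    .findFirst();
        }
    }

    public static void main(String[] args) {
        Shell filter = new Shell("Filter", short[].class, short[].class)
                .add(new Implementation("jvm", "FilterJava.class"))
                .add(new Implementation("fpga", "filter.bit"));
        // Look up whichever implementation suits the processor found at run time.
        System.out.println(filter.findFor("fpga").map(i -> i.artifact).orElse("none"));
    }
}
```

Keeping the data types on the shell rather than on each implementation means that every implementation of one Algorithm is interchangeable from the point of view of its neighbours.
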

Fig. 1

Components of the Proteus Software Platform 



Fig. 3 

The Algorithm 'shell/implementation' structure 



A. Software Modules 

An Algorithm block defines a unit of operations that
receives data at an input, processes it, and sends the results
out through an output. This is commonly represented by
a block as shown in Figure 2.



Connecting up a number of 'shells' will therefore create a 
high level data flow graph, ensuring that data will be passed 
correctly from one algorithm to the next, independent of 
where the associated 'implementations' are deployed. This 
is illustrated in Figure 4.
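
One way to picture how connected 'shells' form a data flow graph is sketched below. The `Graph`, `Connection` and `Shell` classes are hypothetical, and the rule that a connection is rejected when the producer's output type does not fit the consumer's input type is an assumption about how "passed correctly" might be enforced, not a documented Proteus behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: these class names are not the Proteus API.
final class DataFlowSketch {

    /** Minimal shell: a name plus the data types it consumes and produces. */
    static final class Shell {
        final String name;
        final Class<?> inputType;
        final Class<?> outputType;

        Shell(String name, Class<?> inputType, Class<?> outputType) {
            this.name = name;
            this.inputType = inputType;
            this.outputType = outputType;
        }
    }

    /** A directed edge from one shell's output to another shell's input. */
    static final class Connection {
        final Shell from;
        final Shell to;

        Connection(Shell from, Shell to) {
            this.from = from;
            this.to = to;
        }
    }

    /** A data flow graph built purely from shells, with no knowledge of deployments. */
    static final class Graph {
        private final List<Connection> connections = new ArrayList<>();

        /** Adds an edge, rejecting it if the producer's output does not fit the consumer's input. */
        Graph connect(Shell from, Shell to) {
            if (!to.inputType.isAssignableFrom(from.outputType)) {
                throw new IllegalArgumentException(
                        from.name + " output does not match " + to.name + " input");
            }
            connections.add(new Connection(from, to));
            return this;
        }

        int edgeCount() {
            return connections.size();
        }
    }

    public static void main(String[] args) {
        Shell source = new Shell("Source", Void.class, double[].class);
        Shell filter = new Shell("Filter", double[].class, double[].class);
        Shell sink = new Shell("Sink", double[].class, Void.class);
        Graph graph = new Graph().connect(source, filter).connect(filter, sink);
        System.out.println(graph.edgeCount() + " connections");
    }
}
```

Because the graph refers only to shells, it stays valid whichever implementation is later chosen for each block.
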

Fig. 4

Connecting up a number of Algorithm 'shells' to form a 
data flow graph 



B. Hardware Abstraction Modules (HAMs) 

Because the software platform must be able to utilize
various processor types and other PC resources, a common
layer of abstraction over all these resources is needed. This
abstraction layer must provide information on the type of
Algorithm implementations that are compatible with the
corresponding physical hardware, as well as whether a
compatible Algorithm implementation can currently be deployed
to that hardware (e.g., whether the processor is already
overloaded).

The abstraction layer designed to satisfy the above
requirements models the desired properties of one or more
physical hardware resources as one or more 'virtual
processor' entities, as shown in Figure 5.

C. Proteus Application

The Proteus Application serves two purposes - it pro- 
vides an administrative interface to the end-user, and de- 
fines the mechanism by which data is passed from one Al- 
gorithm to another. 

The administrative interface allows the end-user to per- 
form such operations as starting / stopping the platform 
or selecting the desired algorithm for download. 

The data passing mechanism is defined in the Proteus
Application because various techniques exist, such as
Communicating Sequential Processes (CSP) [12] and Dataflow
Process Networks (PN), and the choice of a particular
mechanism is application-dependent.
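
To make the application-dependent choice concrete, the sketch below shows two interchangeable channel styles built on standard `java.util.concurrent` queues. It is only an approximation under stated assumptions: a `SynchronousQueue` stands in loosely for CSP-style rendezvous, an unbounded `LinkedBlockingQueue` stands in loosely for process-network FIFO buffering, and the names `ChannelSketch`, `Style`, `newChannel` and `sumThrough` are invented for this example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;

// Illustrative sketch only: not the data passing code of the Proteus Application.
final class ChannelSketch {

    /** Assumed labels for the two data passing styles compared here. */
    enum Style { RENDEZVOUS, BUFFERED_FIFO }

    /** Creates the queue that carries data from one Algorithm's output to the next one's input. */
    static <T> BlockingQueue<T> newChannel(Style style) {
        if (style == Style.RENDEZVOUS) {
            return new SynchronousQueue<>();   // writer waits until a reader takes the item
        }
        return new LinkedBlockingQueue<>();    // writer never waits; reader waits for data
    }

    /** Sends 1..n from a producer thread and returns the sum received by the consumer. */
    static int sumThrough(Style style, int n) {
        BlockingQueue<Integer> channel = newChannel(style);
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= n; i++) {
                    channel.put(i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            int sum = 0;
            for (int i = 0; i < n; i++) {
                sum += channel.take(); // blocks until data is available
            }
            producer.join();
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while exchanging data", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumThrough(Style.RENDEZVOUS, 4));
        System.out.println(sumThrough(Style.BUFFERED_FIFO, 4));
    }
}
```

Both styles deliver the same data in the same order; they differ in when the producing Algorithm is allowed to run ahead of the consuming one, which is the kind of trade-off left to each application.
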

Figure 6 shows the set-up of Processors and Algorithms, with
the AlgorithmImplementations deployed to corresponding
Processors. The portions concerning data exchange, which have
to be implemented by the Proteus Application, are marked
accordingly.

Fig. 6

Data exchange mechanism implemented by Proteus 
Application 



Fig. 5 

Abstraction of physical hardware via 'Virtual Processors' 



These 'virtual processors' will be queried by the software 
platform to determine the compatibility and deployability 
of a particular Algorithm implementation, as described in 
detail in Section IV.
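
A 'virtual processor' that answers such queries might look like the following sketch. The interface and its method names are hypothetical, and measuring capacity in abstract integer "units" is an assumption made purely to give the deployability check something to count.

```java
// Illustrative sketch only: these names are not the Proteus API.
final class VirtualProcessorSketch {

    /** What the platform is assumed to ask of any abstracted hardware resource. */
    interface VirtualProcessor {
        String typeTag();                      // which implementations it is compatible with
        boolean canDeploy(int requiredUnits);  // whether capacity remains right now
        void deploy(int requiredUnits);        // claim capacity for one implementation
    }

    /** Models an FPGA by its remaining logic capacity in abstract units (an assumed metric). */
    static final class FpgaProcessor implements VirtualProcessor {
        private final String tag;
        private int freeUnits;

        FpgaProcessor(String tag, int freeUnits) {
            this.tag = tag;
            this.freeUnits = freeUnits;
        }

        @Override
        public String typeTag() {
            return tag;
        }

        @Override
        public boolean canDeploy(int requiredUnits) {
            return requiredUnits <= freeUnits;
        }

        @Override
        public void deploy(int requiredUnits) {
            if (!canDeploy(requiredUnits)) {
                throw new IllegalStateException("insufficient capacity");
            }
            freeUnits -= requiredUnits;
        }
    }

    public static void main(String[] args) {
        VirtualProcessor fpga = new FpgaProcessor("fpga.xilinx.virtex", 100);
        fpga.deploy(60);
        System.out.println(fpga.typeTag() + " can take 60 more units: " + fpga.canDeploy(60));
    }
}
```

Separating `typeTag` from `canDeploy` mirrors the two questions in the text: compatibility is a fixed property of the hardware, while deployability changes as implementations are loaded.
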

For a particular physical hardware resource (such as an
FPGA processor board or a storage medium), the 'virtual
processor' is part of a larger package called the 'Hardware
Abstraction Module' (HAM), which is a distribution JAR of all
the hardware-specific components (such as the interfaces to
the software platform and the OS-specific device drivers).



IV. Deployment of Algorithms 

For the software platform to perform the tasks of match- 
ing Algorithm implementations with virtual processors, a 
technique of tagging both of these with some common form 
of type compatibility identification is needed. This tagging 
should offer the ability to define different levels of com- 
patibility, such as that at a specific chip model or at a 
higher family level. For example, an FPGA Algorithm im- 
plementation may be compatible with only the Xilinx Vir- 
tex XCV100 chip, or may be compatible with all chips in 
the Virtex family, and should be allowed to be tagged as 
such. 

The tagging mechanism designed consists of a string in the
general form "type.make.family.model.otherInfo", which can
have any number of descriptors separated by dots ("."),
depending on the level at which an Algorithm implementation
or virtual processor is specific. For example, an Algorithm
implementation that can be downloaded to a Xilinx Virtex
series XCV100 chip may be tagged "fpga.xilinx.virtex.xcv100",
while a virtual processor that accepts all Xilinx Virtex
Algorithm implementations may have the tag
"fpga.xilinx.virtex". The specificity of a tag increases with
the number of descriptors. Such a scheme can be illustrated
as a tree graph, as shown in Figure 7.
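
The dotted tag format lends itself to very simple handling. The helper below is a sketch under the assumption that tags are plain strings split on "."; the class `TagSketch` and its method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: these helpers are not the Proteus API.
final class TagSketch {

    /** Splits a dotted tag into its descriptors, least specific first. */
    static List<String> descriptors(String tag) {
        return Arrays.asList(tag.trim().split("\\."));
    }

    /** Specificity is simply the number of descriptors in the tag. */
    static int specificity(String tag) {
        return descriptors(tag).size();
    }

    /** Returns the next less specific tag (the parent in the tree), or null at the top level. */
    static String parent(String tag) {
        int lastDot = tag.lastIndexOf('.');
        return lastDot < 0 ? null : tag.substring(0, lastDot);
    }

    public static void main(String[] args) {
        String tag = "fpga.xilinx.virtex.xcv100";
        System.out.println(descriptors(tag) + " has specificity " + specificity(tag));
        System.out.println("parent: " + parent(tag));
    }
}
```

Dropping the last descriptor moves one level up the tree, so the chain of parents of a tag is exactly the set of less specific tags that cover it.
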

Fig. 7

Algorithm implementation / virtual processor tagging

In this tree graph, the least specific descriptor is at the
top - in this case the 'type' level, with the descriptor value
"FPGA". Moving down one level introduces the next more
specific 'make' descriptors, so a tag here may be
"FPGA.Xilinx". When the Proteus Software Platform tests
whether an Algorithm implementation is compatible with a
virtual processor, it needs only to ensure that the tag of the
virtual processor is located at the same point on the tree as,
or is an ancestor of, that of the Algorithm implementation.
That is, a more specific (lower in the tree) Algorithm
implementation can only be deployed to an equal or less
specific virtual processor (equal or higher in the tree). For
example, an Algorithm implementation tagged
"FPGA.Xilinx.Virtex.XCV100" is compatible with a virtual
processor of type "FPGA.Xilinx.Virtex", but not with one of
type "FPGA.Xilinx.Virtex.revB".

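The ancestor-or-equal rule above amounts to a descriptor-by-descriptor prefix test. The following is a minimal sketch of that rule; the class and method names are hypothetical, and comparing descriptors case-insensitively is an assumption made because the examples in the text appear in both lower and mixed case.

```java
import java.util.Locale;

// Illustrative sketch only: not the Proteus compatibility check itself.
final class TagCompatibility {

    /** True if the processor tag equals the implementation tag or is an ancestor of it. */
    static boolean isCompatible(String implementationTag, String processorTag) {
        String[] impl = normalize(implementationTag).split("\\.");
        String[] proc = normalize(processorTag).split("\\.");
        if (proc.length > impl.length) {
            return false; // the processor is more specific than the implementation
        }
        for (int i = 0; i < proc.length; i++) {
            if (!proc[i].equals(impl[i])) {
                return false; // the two tags sit on different branches of the tree
            }
        }
        return true;
    }

    /** Assumed normalization: ignore surrounding whitespace and letter case. */
    private static String normalize(String tag) {
        return tag.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String impl = "FPGA.Xilinx.Virtex.XCV100";
        System.out.println(isCompatible(impl, "FPGA.Xilinx.Virtex"));
        System.out.println(isCompatible(impl, "FPGA.Xilinx.Virtex.revB"));
    }
}
```

Comparing whole descriptors rather than raw string prefixes avoids treating a hypothetical family name such as "virtex2" as a descendant of "virtex".
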
For each Algorithm to be deployed, the software platform
runs through the list of available Algorithm implementations
and virtual processors to identify those that are compatible.
Once a match is found, the virtual processor is queried as to
whether the matching Algorithm implementation can be deployed
to it. This deployability step is necessary to test whether
the associated hardware has the capacity to run the compatible
Algorithm implementation, e.g. whether an FPGA has sufficient
available space. If not, the process is repeated until a match
that is both compatible and deployable is found. This flow for
Algorithm deployment is illustrated in Figure 8.
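
The deployment flow described above can be sketched as two nested loops with a compatibility test followed by a deployability test. Everything named here is an assumption for illustration: the `Implementation` and `Processor` classes, the integer capacity units, lower-case tags, and the choice to return null when no match exists.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: these names are not the Proteus API.
final class DeploymentSketch {

    /** An Algorithm implementation with its type tag and an assumed capacity demand. */
    static final class Implementation {
        final String tag;
        final int requiredUnits;

        Implementation(String tag, int requiredUnits) {
            this.tag = tag;
            this.requiredUnits = requiredUnits;
        }
    }

    /** A virtual processor with a type tag and an assumed count of free capacity units. */
    static final class Processor {
        final String tag;
        private int freeUnits;

        Processor(String tag, int freeUnits) {
            this.tag = tag;
            this.freeUnits = freeUnits;
        }

        /** Compatible if this tag equals, or is an ancestor of, the implementation's tag. */
        boolean isCompatible(Implementation impl) {
            // Appending "." makes the prefix test respect descriptor boundaries.
            return (impl.tag + ".").startsWith(tag + ".");
        }

        boolean canDeploy(Implementation impl) {
            return impl.requiredUnits <= freeUnits;
        }

        void deploy(Implementation impl) {
            freeUnits -= impl.requiredUnits;
        }
    }

    /** Returns the processor that received an implementation, or null if none could take one. */
    static Processor deploy(List<Implementation> implementations, List<Processor> processors) {
        for (Implementation impl : implementations) {
            for (Processor processor : processors) {
                if (!processor.isCompatible(impl)) {
                    continue; // step 1: type compatibility
                }
                if (!processor.canDeploy(impl)) {
                    continue; // step 2: deployability, e.g. not enough free space
                }
                processor.deploy(impl);
                return processor;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Processor> processors = Arrays.asList(
                new Processor("jvm", 1000),
                new Processor("fpga.xilinx.virtex", 10),
                new Processor("fpga.xilinx.virtex", 100));
        List<Implementation> implementations = Arrays.asList(
                new Implementation("fpga.xilinx.virtex.xcv100", 60));
        Processor chosen = deploy(implementations, processors);
        System.out.println(chosen == null ? "no match" : "deployed to " + chosen.tag);
    }
}
```

In this sketch a compatible but full processor is simply skipped, so the search continues until a processor passes both tests or the lists are exhausted.
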



Fig. 8 

Algorithm deployment flow 



V. Conclusion 

The software platform designed satisfies the requirements
for a high-level, portable reconfigurable computing platform
framework that is highly modular and flexible enough to
utilize the varied resources available on different PCs.